Deep learning image reconstruction generates thinner slice iodine maps with improved image quality to increase diagnostic acceptance and lesion conspicuity: a prospective study on abdominal dual-energy CT

Background To assess the improvement of image quality and diagnostic acceptance of thinner slice iodine maps enabled by deep learning image reconstruction (DLIR) in abdominal dual-energy CT (DECT). Methods This study prospectively included 104 participants with 136 lesions. Four series of iodine maps were generated based on portal-venous scans of contrast-enhanced abdominal DECT: 5-mm and 1.25-mm using adaptive statistical iterative reconstruction-V (Asir-V) with 50% blending (AV-50), and 1.25-mm using DLIR with medium (DLIR-M) and high strength (DLIR-H). The iodine concentrations (IC) and their standard deviations at nine anatomical sites were measured, and the corresponding coefficients of variation (CV) were calculated. Noise power spectrum (NPS) and edge rise slope (ERS) were measured. Five radiologists rated image quality in terms of image noise, contrast, sharpness, texture, and small structure visibility, and evaluated overall diagnostic acceptability of images and lesion conspicuity. Results The four reconstructions maintained the IC values unchanged at nine anatomical sites (all p > 0.999). Compared to 1.25-mm AV-50, 1.25-mm DLIR-M and DLIR-H significantly reduced CV values (all p < 0.001) and presented lower noise and noise peak (both p < 0.001). Compared to 5-mm AV-50, 1.25-mm images had higher ERS (all p < 0.001). The differences in peak and average spatial frequency among the four reconstructions were relatively small but statistically significant (both p < 0.001). The 1.25-mm DLIR-M images were rated higher than the 5-mm and 1.25-mm AV-50 images for diagnostic acceptability and lesion conspicuity (all p < 0.001). Conclusions DLIR may facilitate thinner slice iodine maps in abdominal DECT with improved image quality, diagnostic acceptability, and lesion conspicuity. Supplementary Information The online version contains supplementary material available at 10.1186/s12880-024-01334-0.

Background
Dual-energy CT (DECT) allows material decomposition analysis to generate iodine maps for quantifying the presence of iodine-containing contrast material [1]. Iodine mapping is a desirable imaging biomarker for the detection of vascular emboli, characterization of lesions, and evaluation of treatment response [2][3][4]. The current application of iodine maps is mainly based on established iodine concentration thresholds; therefore, the accuracy and consistency of iodine concentration measurements have attracted the attention of many researchers [5][6][7][8]. However, the application and investigation of iodine maps for diagnosis are hindered by low spatial resolution and severe image noise. Slice thicknesses of 5 mm and 1.25 mm are used for diagnosis in current clinical routine [9], with 5 mm being more common for the abdomen due to concerns about patient radiation dose. The 5-mm images have lower image noise but may suffer from low spatial resolution due to the partial volume effect. In contrast, the 1.25-mm images provide higher spatial resolution for depicting more lesion details, but at the cost of a reduced signal-to-noise ratio.
It is thus very important to reconstruct iodine maps in DECT into thinner slices with reduced image noise if iodine maps are to gain clinical acceptance as interpretative images. Deep learning image reconstruction (DLIR), an algorithm that uses deep convolutional neural networks, has been developed and tested for improving image quality in virtual monochromatic images (VMIs) with reduced radiation and iodine dose [10][11][12][13][14][15][16][17][18], and has also shown higher accuracy in iodine concentration measurements [5][6][7][8]. Nevertheless, it has not been fully investigated whether DLIR can improve the image quality of iodine maps with thinner slice thickness to increase diagnostic confidence [19].
Therefore, this prospective study aimed to test the hypothesis that thin slice (1.25-mm) iodine maps combined with DLIR in abdominal DECT can provide lower image noise and higher spatial resolution, and thereby improve diagnostic acceptance and lesion conspicuity, in comparison with iodine maps reconstructed using the state-of-the-art adaptive statistical iterative reconstruction-V (Asir-V) algorithm.

Methods
The local institutional ethics review board approved this study, and written informed consent was obtained from all participants. Fig. 1 presents the workflow of the current study.

Participants
This study prospectively screened 110 consecutive participants who underwent standard abdominal contrast-enhanced CT for clinical purposes at our institution between March and July 2022 using a DECT scanner (Revolution Apex CT, GE Healthcare). The inclusion criteria were: (a) ≥ 18 years old; (b) scheduled to undergo abdominal contrast-enhanced CT for clinical purposes; (c) agreement to participate in the study. The exclusion criteria were: (a) severe artifacts; (b) incomplete image series or reconstruction failure; (c) lack of an anatomical structure for evaluation. Six participants were excluded for incomplete image series (n = 3), severe motion artifacts (n = 2), and splenectomy (n = 1). Accordingly, 104 participants were included in the study. One radiologist with 4 years of experience in radiology identified 136 lesions for analysis. Lesions were selected if they were detectable on the portal-venous scans. When multiple lesions were detectable in the same organ, the largest lesion was selected. These lesions were confirmed by an abdominal radiologist with 20 years of experience in abdominal radiology.

Image acquisition and reconstruction
The abdominal contrast-enhanced DECT scans were conducted using a typical abdominal protocol. The DECT scans were performed using a fast-kilovoltage-switching technique between 80 and 140 kVp, with an automatic tube current (GSI Assist, GE Healthcare), 0.625 × 128 mm of beam collimation, 0.8 s of rotation time, 0.992:1 of pitch factor, 50 cm of scan field of view, 38 cm of display field of view, and 512 × 512 of image matrix. A nonionic contrast medium was administered intravenously (approximately 520 mgI/kg body weight) within a fixed duration of 30 s. Real-time fluoroscopic monitoring (140 kVp, 20 mA) was initiated 10 s after the contrast injection. After the bolus-tracking program (SmartPrep, GE Healthcare) detected an enhancement over the threshold of 140 HU in the abdominal aorta, the arterial, portal-venous, and equilibrium scans were initiated with additional delays of 12, 30, and 100 s, respectively. The delay durations in this protocol were optimized by the local institution.
The raw data of the portal-venous scans were used for reconstruction. Four series of iodine maps were reconstructed for each participant using a vendor-specific workstation (Advantage Workstation, AW version 4.7, GE Healthcare): 5-mm (thick slice) and 1.25-mm (thin slice) using Asir-V with 50% blending (AV-50), and 1.25-mm using DLIR with medium (DLIR-M) and high strength (DLIR-H), all applying the standard kernel. The slice thickness and blending ratio of Asir-V were determined according to the institutional clinical routine. The 5-mm and 1.25-mm AV-50 images were used as references for image noise and spatial resolution, respectively.
Our pilot study suggested that only the 1.25-mm DLIR-M and DLIR-H images have potential for improving both image noise and spatial resolution (Supplementary Note S1). In contrast, DLIR with low strength (DLIR-L) did not show an advantage compared to the routine 1.25-mm or 5-mm AV-50 images. Therefore, the 1.25-mm DLIR-L images were not reconstructed.

Quantitative image evaluation
A radiologist with 4 years of experience in radiology conducted the quantitative image evaluation using the open-source imQuest software version 7.1 (Duke University; https://deckard.duhs.duke.edu/~samei/tg233.html) (Supplementary Note S2) [5,8,9,13]. Regions of interest (ROI) were selected on the 5-mm AV-50 images. The corresponding thinner slice images in the other reconstructions were linked to ensure that identical ROIs were used for evaluation. The iodine concentration (IC) and its standard deviation (SD) were measured for nine anatomic structures. The coefficient of variation (CV) for each structure was calculated to evaluate the extent of variability by dividing the SD by the mean IC over the patient cohort. The image noise magnitude was evaluated using the noise power spectrum (NPS) by placing ROIs on relatively homogeneous regions of the liver. The NPS curve, NPS noise, NPS noise peak, average spatial frequency, and peak spatial frequency were generated. The image sharpness was evaluated using the edge rise slope (ERS). The ERS was calculated by dividing the IC difference between the last dip and the first peak on the rising IC curve by the distance between these two points. The ERS was measured on a selected axial plane showing a running portal vein. Ten participants were randomly selected for repeated measurements by the same radiologist and by another radiologist with 5 years of experience in radiology two weeks after the first readout. These repeated measurements were used to calculate the intra- and inter-rater quantitative measurement variabilities, respectively.
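The CV and ERS definitions above can be illustrated with a minimal Python sketch. It is not the imQuest implementation: the function names, the averaging of per-participant CV values over the cohort, and the simple rule used to locate the "last dip" and "first peak" on a sampled IC profile are all assumptions made for illustration, and the CV follows the conventional definition (SD divided by mean).

```python
from statistics import mean


def coefficient_of_variation(mean_ic, sd_ic):
    """Conventional CV for one ROI: SD divided by the mean IC."""
    if mean_ic == 0:
        raise ValueError("CV is undefined when the mean IC is zero")
    return sd_ic / mean_ic


def cohort_cv(roi_measurements):
    """Average per-participant CV over (mean IC, SD) pairs.

    Averaging over the cohort is an assumption of this sketch.
    """
    return mean([coefficient_of_variation(m, s) for m, s in roi_measurements])


def edge_rise_slope(positions_mm, ic_profile):
    """ERS = (IC at first peak - IC at last dip) / distance between them.

    Assumed rule: the first peak is the first sample that is higher than
    its predecessor and not exceeded by its successor; the last dip is
    found by walking back from the peak while the profile keeps falling.
    """
    n = len(ic_profile)
    peak = None
    for i in range(1, n):
        rises_into = ic_profile[i] > ic_profile[i - 1]
        stops_rising = i == n - 1 or ic_profile[i + 1] <= ic_profile[i]
        if rises_into and stops_rising:
            peak = i
            break
    if peak is None:
        raise ValueError("no rising edge found in the profile")
    dip = peak
    while dip > 0 and ic_profile[dip - 1] < ic_profile[dip]:
        dip -= 1
    return (ic_profile[peak] - ic_profile[dip]) / (positions_mm[peak] - positions_mm[dip])


# Hypothetical IC profile (mgI/mL) sampled every 1 mm across a vessel edge.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
profile = [1.2, 1.0, 1.5, 2.5, 3.0, 2.9]
ers = edge_rise_slope(positions, profile)  # in (mgI/mL)/mm
```

On a noisy profile the simple peak rule would latch onto small fluctuations, so a practical tool would likely smooth the profile first; that step is omitted here.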

Qualitative image evaluation
Five radiologists with 1 to 6 years of experience in radiology performed the qualitative image quality assessment (Supplementary Note S3) [6,7,9,[11][12][13][14][15][16]18]. The readers independently rated image quality in terms of image noise, contrast, sharpness, texture, and small structure visibility, and evaluated overall diagnostic acceptability of images and lesion conspicuity. A value of more than 3 was defined as an image quality satisfactory for clinical use. The readers had to detect the lesions for evaluation by themselves but were instructed which lesions to rate. All the images were randomly presented to the readers without reconstruction parameters. The images were shown with a window width of 15.0 mgI/mL and a window level of 5.0 mgI/mL, the same settings used for daily image interpretation in the reading room. There were no time limits for the evaluation, and the readers were free to adjust the window width and level as well as the viewing distance. Two weeks after the first readout, a radiologist with 5 years of experience in radiology repeated all the assessments of images and lesions. The intra- and inter-rater qualitative assessment variabilities were calculated using the repeated assessments by this rater and the assessments of the five raters, respectively.
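The Statistical analysis section names Kendall's W as the inter-rater agreement statistic for these ratings. A minimal sketch of that statistic, with the standard correction for tied ranks, is given below; the function names and the example ratings are hypothetical, and the authors' R implementation may differ in detail.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores from 1..n, giving tied scores their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for ratings[r][i] = score of rater r on item i.

    Uses the tie-corrected denominator; undefined (division by zero)
    if every rater gives the same score to every item.
    """
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(rank_rows[r][i] for r in range(m)) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    tie_term = sum(
        sum(c ** 3 - c for c in Counter(row).values()) for row in ratings
    )
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * tie_term)


# Hypothetical 5-point scores from five readers for four reconstructions.
example_ratings = [
    [3, 2, 5, 4],
    [3, 2, 4, 4],
    [2, 2, 5, 4],
    [3, 1, 5, 4],
    [3, 2, 5, 5],
]
w = kendalls_w(example_ratings)  # 0 = no agreement, 1 = perfect agreement
```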

Statistical analysis
We used R version 4.1.3 (https://www.r-project.org/) with related packages within RStudio version 1.4.1106 (https://www.rstudio.com/) (Supplementary Note S4) [20] for statistical analysis. The differences among the reconstruction algorithms were assessed using repeated-measures analysis of variance (ANOVA) for quantitative metrics and the Friedman test for qualitative metrics. When there was an overall significant difference, post hoc pairwise comparisons between groups were conducted with Bonferroni correction. For lesion conspicuity, subgroup analysis was performed according to (1) the location of the lesion, (2) the largest diameter of the lesion, and (3) the presentation of the lesion compared to the surrounding tissue. The statistical analysis was two-tailed and the alpha value was set at 0.05. The agreement of quantitative evaluations was assessed using an intraclass correlation coefficient (ICC) of single measurement, absolute agreement, and two-way random-effects model [21]. For qualitative evaluations, the intra-rater and inter-rater agreements were evaluated using the weighted kappa statistic and Kendall's W statistic, respectively [22]. According to our pilot study, the a priori sample size estimation yielded a size of 22 participants for a power of 0.85 when alpha was 0.05. With 104 participants, the post hoc power calculation resulted in 1-beta values > 0.995 when alpha was 0.05, indicating sufficient statistical power [23].
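As an illustration of the post hoc step, the sketch below enumerates the six pairwise comparisons among the four reconstructions and applies a Bonferroni adjustment by multiplying each raw p value by the number of comparisons and capping the result at 1. The raw p values are hypothetical placeholders, not study results, and the authors' R code may compute the adjustment differently. The cap at 1 would be consistent with adjusted values being reported as p > 0.999.

```python
from itertools import combinations

RECONSTRUCTIONS = ["5-mm AV-50", "1.25-mm AV-50", "1.25-mm DLIR-M", "1.25-mm DLIR-H"]


def bonferroni_adjust(raw_p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(raw_p_values)
    return [min(1.0, p * m) for p in raw_p_values]


# Four reconstructions give C(4, 2) = 6 pairwise comparisons.
pairs = list(combinations(RECONSTRUCTIONS, 2))

# Hypothetical raw p values, one per pair, for illustration only.
raw_p = [0.00001, 0.2, 0.03, 0.00004, 0.5, 0.009]
adjusted_p = bonferroni_adjust(raw_p)

for (a, b), p_adj in zip(pairs, adjusted_p):
    verdict = "significant" if p_adj < 0.05 else "n.s."
    print(f"{a} vs {b}: adjusted p = {p_adj:.4f} ({verdict})")
```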

Quantitative evaluation results
The intra-rater and inter-rater agreements were excellent for the IC of the liver (ICC 0.938-0.939) and the SD of liver IC (ICC 0.921-0.967) measurements (Supplementary Table S2). The effect sizes of these quantitative evaluation metrics between the different reconstruction algorithms were calculated (Supplementary Table S3). The IC values remained stable among the four series with different reconstruction algorithms (all p > 0.05) (Table 2; Fig. 2). The CV values of the nine anatomical sites on the thin slice DLIR-M and DLIR-H images were lower than those on the thin slice AV-50 images (all p < 0.001), with those of the thin slice DLIR-H images being the lowest among the three groups. The CV values of the kidney, psoas major, and abdominal subcutaneous fat on the thin slice DLIR-H images were similar to those on the thick slice AV-50 images (all p > 0.999), but those of the other anatomical sites were slightly higher (all other p < 0.001).
The intra-rater and inter-rater agreements were moderate to excellent for NPS metrics (ICC 0.731-0.960) and ERS (ICC 0.933-0.964) measurements (Supplementary Table S2). The noise on the thin slice DLIR-H images was lower than that on the thin slice AV-50 (p < 0.001) and thin slice DLIR-M images (p < 0.001), but higher than that on the thick slice AV-50 images (p < 0.001) (Table 3; Fig. 3). The noise peaks of the DLIR-H images were similar to those of the thick slice AV-50 images (p = 0.150). The peak spatial frequency and average spatial frequency varied among the four reconstruction algorithms, except for the peak spatial frequency between the thin slice AV-50 and thin slice DLIR-H images (p = 0.210). However, the differences in peak and average spatial frequency among the four reconstruction algorithms were relatively small. The ERS values on the thin slice AV-50, thin slice DLIR-M, and thin slice DLIR-H images were higher than those on the thick slice AV-50 images (all p < 0.001), but no difference in ERS values could be identified among the thin slice images (all p > 0.999).

Qualitative evaluation results
The intra-rater and inter-rater agreements were moderate to good (weighted kappa statistic 0.546-0.769; Kendall's W statistic 0.536-0.717) for qualitative analysis (Supplementary Table S2). The effect sizes of these qualitative evaluation metrics between the different reconstruction algorithms were calculated (Supplementary Table S4). The thin slice DLIR-H images were rated the highest for image noise (p < 0.001) and presented ratings for small structure visibility comparable to those of the thin slice DLIR-M images (p > 0.999) (Table 4; Fig. 4). The thin slice DLIR-M images were rated the highest in terms of image contrast, sharpness, and texture, and gained the highest diagnostic acceptance among the four image reconstruction algorithms (all p < 0.001). The thin slice AV-50 images suffered from nondiagnostic image noise, while the thick slice AV-50 images only showed advantages in image noise. Both AV-50 image series were less acceptable for diagnosis than the DLIR images (all p < 0.001).
For lesion conspicuity, the thin slice images were rated higher than the thick slice AV-50 images by the five readers (all p < 0.001) (Table 5; Fig. 4). Among the thin slice images, the thin slice DLIR-M images gained higher ratings than the thin slice AV-50 and thin slice DLIR-H images (both p < 0.001). The subgroup analysis for lesion conspicuity suggested that the thin slice DLIR-M images were the most acceptable for the readers, while the thin slice DLIR-H images did not show significant improvements in lesion conspicuity compared to thin slice AV-50. Representative cases for lesion characterization are shown in Figs. 5 and 6, and Supplementary Figure S1.

Discussion
The current study evaluated the improvements in image quality, diagnostic acceptance, and lesion conspicuity achieved by thinner slice iodine maps combined with the DLIR algorithm. The thin slice DLIR images provided stable IC measurements compared to the conventional Asir-V image reconstruction algorithm and showed lower CV values than thin slice AV-50, allowing accurate and consistent iodine quantification. The thin slice DLIR significantly reduced the image noise compared to thin slice AV-50, while providing higher spatial resolution with thinner slice thickness compared to thick slice AV-50. The subjective evaluation showed higher diagnostic acceptance and higher lesion conspicuity with thin slice DLIR images compared to thick slice AV-50 images, indicating the potential of thin slice iodine maps with DLIR for clinical diagnostic purposes.
Previous phantom studies have demonstrated that DECT scanners using a fast-kilovoltage-switching mode with DLIR can provide a possible small improvement in iodine quantification accuracy compared with Asir-V [5,8]. Clinical studies further confirmed the potential of DLIR in reducing image noise as well as variability of IC values compared to Asir-V [6][7][8]. However, these studies only investigated IC accuracy and image quality at a single slice thickness, either 5 mm [5] or thin slice [6][7][8]. As the current clinical standard for obtaining iodine maps is still iterative reconstruction with relatively thick slices, we further investigated the influence of slice thickness on iodine quantification. Our results suggested that DLIR allows thinner slice thickness with consistent IC values.
Thicker slice images have low image noise but usually present with lower spatial resolution and suffer from partial volume effects. This leads to difficulties in displaying small and low-density objects. In contrast, improved ERS and spatial resolution can provide higher sharpness and better contrast, allowing better detectability of lesions. Thinner slice images therefore have potential for improving the conspicuity of these lesions, but they increase the image noise [9]. In our study, the thin slice AV-50 showed higher ERS values than thick slice AV-50 but suffered from increased image noise, which resulted in suboptimal clinical acceptance as evaluated by five radiologists. As thinner slices with AV-50 cannot provide a satisfactory balance of image quality for lesion detection, the new DLIR algorithm was introduced. To overcome the dilemma between spatial resolution and image noise, the DLIR algorithm has been used and has shown potential for improving image quality in VMIs [10][11][12][13][14][15][16][17][18]. The DLIR algorithm was developed using deep convolutional neural networks with ground-truth training data of filtered back-projection images acquired with high-dose scans, to generate high-quality images from low-dose scans. The reduced image noise is believed to yield lower variability in the measured IC values [8].
Table note: Data are presented as mean ± standard deviation. p = p value for repeated-measures analysis of variance. p1 to p6 are p values for post hoc pairwise comparisons using Bonferroni correction; the p values are presented as adjusted p values.
The thin slice DLIR images presented ERS values similar to those of thin slice AV-50 while maintaining relatively low image noise compared to thick slice AV-50, and thus gained higher clinical acceptance in the subjective evaluation. Therefore, we believe that DLIR may facilitate a thinner slice thickness as a new state-of-the-art standard for routine reconstruction of iodine maps in abdominal DECT, replacing the original thicker slice iodine maps using Asir-V.
Our study has the following limitations. First, the current study was conducted with a relatively small sample size at one institution. Although the post hoc power calculation showed high statistical power, our conclusions require further validation in other centers. Second, our study employed only one fast-kilovoltage-switching DECT scanner since the DLIR algorithm is vendor-specific, and we only compared the vendor-specific Asir-V algorithm with DLIR-M and DLIR-H. The inter-vendor and inter-scanner differences were not assessed [5,[24][25][26][27]. However, we chose thick slice AV-50 iodine maps as the benchmark to present the improvement accomplished by DLIR compared to the current clinical standard. DLIR with low strength was not included because it was not expected to provide acceptable image quality [17,18]. Third, our study only measured the IC values of normal structures. The influence of DLIR on iodine quantification and diagnosis must be ascertained with respect to different diseases. Also, the potential influence of DLIR on advanced quantitative analysis was not evaluated [5,25]. Fourth, the qualitative image evaluation in our study was conducted by five radiologists with 1 to 6 years of experience in radiology. The radiologists' level of experience may introduce bias in the ratings. The results of our study should be validated by more studies with more radiologists with different levels of experience. Fifth, the diagnostic acceptance of iodine maps was not compared with that of VMIs, as the best kiloelectron voltage level for VMIs using DLIR has not been determined yet. Further comparisons between VMIs and iodine maps are necessary to tell whether iodine maps have potential advantages for diagnostic purposes. Sixth, the potential influence of factors such as patient motion, contrast agent dosage, and scanner settings on image quality was not assessed in our study. Future studies may focus on these factors to deepen the application of DLIR in clinical practice. Finally, the DLIR algorithm is a black box, and further investigation is needed for it to gain acceptance in clinical practice. Further investigations are needed on its robustness to artifacts and noise [28][29][30][31], as well as its potential influence on later image processing steps [32][33][34]. Future investigations are encouraged to explore the impact of DLIR on specific types of lesions or to compare its performance across different patient populations. Moreover, the cost-effectiveness of implementing DLIR in clinical practice would also be an interesting topic.
To summarize, thinner slice iodine maps with DLIR in abdominal DECT can keep the iodine concentration measurement values unchanged with lower variability compared with the standard reconstructions, allowing consistent quantitative iodine analysis using established threshold values, and can provide improved image quality with reduced image noise, more natural image texture, and better spatial resolution. Compared to the standard thicker slice reconstructions, thinner slice iodine images with DLIR can potentially improve the accuracy of lesion detection and characterization in abdominal DECT. Future studies are encouraged to determine whether DLIR has clinical impact on iodine quantification and diagnostic confidence for specific clinical tasks.

Fig. 2 Iodine concentration and variability. (A) IC values, (B) SD values, and (C) CV values of nine anatomical sites obtained using the four different image reconstruction algorithms

Fig. 3 NPS and ERS assessment. (A) NPS curves obtained with the four different image reconstruction algorithms using a homogeneous region of the liver of a patient. (B) The NPS and ERS metrics among the four different image reconstruction algorithms. n.s. indicates non-significant post hoc comparison results using the Bonferroni method

Fig. 4 Qualitative image quality and lesion conspicuity rating. Bar plots for image noise, image contrast, image sharpness, small structure visibility, diagnostic acceptability of participants (n = 104), and lesion conspicuity of evaluated lesions (n = 136) obtained with four image reconstruction algorithms. n.s. indicates non-significant post hoc comparison results using the Bonferroni method

Fig. 5 Four examples of abdominal CT studies. These portal-venous phase abdominal CT studies were reconstructed into iodine maps using 5-mm AV-50, 1.25-mm AV-50, 1.25-mm DLIR-M, and 1.25-mm DLIR-H algorithms, and are presented with the same windowing (width of 15.0 mgI/mL, level of 5.0 mgI/mL). (A) A patient underwent CT for cancer staging. A low-density hepatic lesion was better visualized in the 1.25-mm images than in the 5-mm AV-50 image. The 1.25-mm DLIR image better balanced the image noise and spatial resolution. (B) A patient underwent CT for cancer staging. The enhanced hepatic lesion can be detected in all the images, while the details of the texture and boundary were better visualized in the 1.25-mm images. The lesion of the stomach can be detected in all the images, but the enhanced range was better displayed with lower image noise in the 1.25-mm DLIR-H images, potentially allowing more accurate staging. (C) A patient underwent CT for a hepatic lesion. Compared with the 5-mm AV-50 image, the 1.25-mm images better showed the enhancement pattern of the lesion. The 1.25-mm DLIR-H images presented the lesion with more details and lower image noise. (D) A patient underwent CT for a hepatic lesion. A hepatic lesion with surrounding enhancement was detected. The lesion texture is better depicted with the 1.25-mm images. The 1.25-mm DLIR-H images best balanced the image noise and spatial resolution

Fig. 6 Four examples of abdominal CT studies with measurements. These portal-venous phase abdominal CT studies were reconstructed into iodine maps using 5-mm AV-50, 1.25-mm AV-50, 1.25-mm DLIR-M, and 1.25-mm DLIR-H algorithms, respectively, and are presented with the same windowing (width of 15.0 mgI/mL, level of 5.0 mgI/mL). (A) A patient was scanned for a suspected left renal lesion. With the 1.25-mm images, the internal enhancement was better detected. The difference in iodine concentration between the low-density compartment and the subtle enhancement was more obvious with the 1.25-mm AV-50 (1.14 mgI/mL) and 1.25-mm DLIR-H (1.04 mgI/mL) images than with the 5-mm AV-50 (0.68 mgI/mL) image, which suffered from partial volume effects. (B) A patient was scanned for cancer staging. The hepatic lesion with subtle circular and central enhancement in the 5-mm AV-50 image was suspected to be a metastatic lesion. The 1.25-mm images better visualized the boundary of the lesion, and the subtle circular enhancement was excluded. The detailed central enhancement in the lesion can be better depicted in the 1.25-mm images. The ERS of the border of the enhancement was higher in the 1.25-mm AV-50 [0.437 (mgI/mL)/mm] image than in the 5-mm AV-50 [0.262 (mgI/mL)/mm] image, and was even higher in the 1.25-mm DLIR-H [0.655 (mgI/mL)/mm] image. The follow-up scans indicated that the lesion is benign. (C) A patient was scanned for a lesion in the spleen. The difference in iodine concentration between the low-density lesion and the spleen parenchyma was more obvious with the 1.25-mm AV-50 (1.14 mgI/mL) and 1.25-mm DLIR-H (1.04 mgI/mL) images than with the 5-mm AV-50 (0.60 mgI/mL) image. The ERS of the lesion boundary was higher in the 1.25-mm AV-50 [0.334 (mgI/mL)/mm] and 1.25-mm DLIR-H [0.359 (mgI/mL)/mm] images than in the 5-mm AV-50 [0.164 (mgI/mL)/mm] image. The 1.25-mm DLIR-H image with lower noise improved the diagnostic confidence of a cyst. (D) A patient with gastric cancer was scanned for hepatic metastases. With the 1.25-mm images, the left boundary of the lesion was clearer than in the 5-mm AV-50 images. The ERS of the subtle left boundary was higher in the 1.25-mm DLIR-H [0.586 (mgI/mL)/mm] image than in the 1.25-mm AV-50 [0.373 (mgI/mL)/mm] image, and lowest in the 5-mm AV-50 [0.166 (mgI/mL)/mm] image. It was hard to measure the size of the lesion in the 5-mm image, while it was easier to measure in the 1.25-mm images, to guide later treatment selection

Table 1
Participant and lesion characteristics
The current application of iodine maps is mainly based on established iodine concentration thresholds [2][3][4]. Our study suggested that DLIR can be safely accepted as a new reconstruction algorithm for quantitative analysis of the abdomen.

Table 2
Iodine concentration and variability

Table 3
NPS and ERS assessment